Post-Selection Inference
Authors
Abstract
It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid “post-selection inference” by reducing the problem to one of simultaneous inference. Simultaneity is required for all linear functions that arise as coefficient estimates in all submodels. By purchasing “simultaneity insurance” for all possible submodels, the resulting post-selection inference is rendered universally valid under all possible model selection procedures. This inference is therefore generally conservative for particular selection procedures, but it is always less conservative than full Scheffé protection. Importantly, it does not depend on the truth of the selected submodel, and hence it produces valid inference even in wrong models. We describe the structure of the simultaneous inference problem and give some asymptotic results.
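The reduction to simultaneous inference described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes a known error variance (so z-statistics rather than t-statistics), enumerates every submodel by brute force (feasible only for a handful of predictors), and uses the hypothetical function name `posi_constant`. It estimates a constant K such that, with probability about 1 - alpha, every coefficient z-statistic in every submodel is at most K in absolute value, and compares it with a Scheffé-type constant computed from the same draws.

```python
from itertools import combinations

import numpy as np


def posi_constant(X, alpha=0.05, n_sims=5000, seed=0):
    """Monte Carlo sketch of a simultaneous constant over all submodels.

    Assumes the error variance is known (an illustrative simplification).
    Returns (k_posi, k_scheffe), both estimated from the same noise draws.
    """
    n, p = X.shape
    directions = []
    # Every coefficient estimate in every submodel M is a linear function of
    # the response: the inner product with predictor j after it has been
    # adjusted for the other predictors in M.
    for size in range(1, p + 1):
        for M in combinations(range(p), size):
            for j in M:
                others = [k for k in M if k != j]
                r = X[:, j].copy()
                if others:
                    coef, *_ = np.linalg.lstsq(X[:, others], X[:, j], rcond=None)
                    r = X[:, j] - X[:, others] @ coef
                directions.append(r / np.linalg.norm(r))  # unit length
    L = np.column_stack(directions)  # n x (p * 2**(p - 1)) directions

    Q, _ = np.linalg.qr(X)  # orthonormal basis of the column space of X

    rng = np.random.default_rng(seed)
    E = rng.standard_normal((n_sims, n))  # pure-noise responses, sigma = 1

    # Largest |z| over all (j, M) pairs, per simulated response.
    max_abs_z = np.abs(E @ L).max(axis=1)
    # Scheffe protection covers every direction in the column space of X.
    scheffe_stat = np.linalg.norm(E @ Q, axis=1)

    k_posi = float(np.quantile(max_abs_z, 1 - alpha))
    k_scheffe = float(np.quantile(scheffe_stat, 1 - alpha))
    return k_posi, k_scheffe


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((40, 3))  # hypothetical design, 3 predictors
    k_posi, k_scheffe = posi_constant(X)
    print(f"simultaneous constant: {k_posi:.2f}, Scheffe-type: {k_scheffe:.2f}")
```

Under these assumptions an interval of the form estimate ± K × standard error, applied to whichever submodel is selected, would hold simultaneously for all submodels. The estimated K should exceed the usual 1.96 and not exceed the Scheffé-type constant, since each adjusted predictor lies in the column space of X.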
Similar resources
Valid Post-Selection and Post-Regularization Inference: An Elementary, General Approach
Here we present an expository, general analysis of valid post-selection or post-regularization inference about a low-dimensional target parameter, α, in the presence of a very high-dimensional nuisance parameter, η, which is estimated using modern selection or regularization methods. Our analysis relies on high-level, easy-to-interpret conditions that allow one to clearly see the structures nee...
Recent Developments in Post-Selection Inference
It is common in modern applications to use data-dependent model selection tools to select a promising model before drawing inference over the parameters of the selected model. However, this simple series of steps conceals a significant fault that is often left unattended: the act of selection biases the distributions of test statistics and makes standard inference procedures unsound. This is re...
POST-SELECTION INFERENCE By Richard Berk
It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid “post-selection inference” by reducing the problem to ...
Valid Post-Selection Inference
It is common practice in statistical data analysis to perform data-driven variable selection and derive statistical inference from the resulting model. Such inference enjoys none of the guarantees that classical statistical theory provides for tests and confidence intervals when the model has been chosen a priori. We propose to produce valid “post-selection inference” by reducing the problem to...